proprietary data
QATCH: Benchmarking SQL-centric tasks with Table Representation Learning Models on Your Data
Table Representation Learning (TRL) models are commonly pre-trained on large open-domain datasets comprising millions of tables and then used to address downstream tasks. Choosing the right TRL model to use on proprietary data can be challenging, as the best results depend on the content domain, schema, and data quality. Our purpose is to support end-users in testing TRL models on proprietary data in two established SQL-centric tasks, i.e., Question Answering (QA) and Semantic Parsing (SP). We present QATCH (Query-Aided TRL Checklist), a toolbox to highlight TRL models' strengths and weaknesses on relational tables unseen at training time. For an input table, QATCH automatically generates a testing checklist tailored to QA and SP. Checklist generation is driven by a SQL query engine that crafts tests of different complexity. This design facilitates inherent portability, allowing the checks to be used by alternative models. We also introduce a set of cross-task performance metrics evaluating the TRL model's performance over its output. Finally, we show how QATCH automatically generates tests for proprietary datasets to evaluate various state-of-the-art models including TAPAS, TAPEX, and CHATGPT.
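The abstract describes checklist generation driven by a SQL query engine that crafts tests of different complexity and executes them to obtain gold answers. A minimal sketch of that idea is below; the templates and function name are illustrative assumptions, not QATCH's actual implementation.

```python
import sqlite3
import pandas as pd

def generate_checklist(df: pd.DataFrame, table_name: str = "t"):
    """Generate (question, SQL, gold answer) test triples of increasing
    complexity from an input table. Each SQL query is executed against the
    table so the model's output can later be checked against a gold answer.
    Templates here are illustrative, not QATCH's actual test suite."""
    conn = sqlite3.connect(":memory:")
    df.to_sql(table_name, conn, index=False)
    numeric = df.select_dtypes("number").columns
    tests = []
    # Simplest complexity level: project a single column.
    for col in df.columns:
        tests.append((f"Show all values of {col}.",
                      f"SELECT {col} FROM {table_name}"))
    # Next level: aggregate over a numeric column.
    for col in numeric:
        tests.append((f"What is the maximum {col}?",
                      f"SELECT MAX({col}) FROM {table_name}"))
    # Execute each query to obtain the gold answer for cross-task scoring.
    return [(q, sql, conn.execute(sql).fetchall()) for q, sql in tests]

checklist = generate_checklist(
    pd.DataFrame({"city": ["Turin", "Seattle"],
                  "population": [870_000, 750_000]}))
```

Because the checks are plain natural-language questions paired with SQL, the same checklist can be fed to any QA or SP model, which is the portability property the abstract highlights.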
Using a Large Language Model to generate a Design Structure Matrix
DSM is known for its simplicity and conciseness in representation and exists in the form of a square matrix that maps the relationships between the set of system elements [Yassine and Braha 2003; Browning 2015]. An example DSM (n = 4) is shown in Figure 1. Based on the DSM convention described by Browning [2001], Element 1 depends on Element 2, as indicated by a red cell entry in row 2, column 1 of the DSM. Likewise, Element 4 depends on Element 3, as indicated in row 3, column 4. The diagonal of the DSM maps each element to itself and is indicated as black cells in Figure 1. The diagonal is usually left empty but is sometimes used as a space to store element-specific data, such as the likelihood of changing the given element based on market projection [Koh et al. 2013]. The DSM in Figure 1 is not symmetrical across the diagonal, indicating asymmetrical dependencies between the system elements. For example, Element 1 depends on Element 2, but Element 2 does not depend on Element 1. In contrast, the example DSM shows that Element 2 and Element 4 have a symmetrical interdependency. It is important to note that a transposed version of the DSM convention is also widely adopted by many (e.g.
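The Figure 1 example can be sketched as a small matrix. This follows Browning's convention as described above (a mark in row i, column j means element j depends on element i); the dependencies encoded here are only the ones named in the text.

```python
import numpy as np

n = 4  # number of system elements in the Figure 1 example
dsm = np.zeros((n, n), dtype=int)

# Browning's convention: dsm[i, j] == 1 means Element j+1 depends on Element i+1
# (rows and columns are 0-indexed here, 1-indexed in the text).
dsm[1, 0] = 1  # Element 1 depends on Element 2 (row 2, column 1)
dsm[2, 3] = 1  # Element 4 depends on Element 3 (row 3, column 4)
dsm[1, 3] = 1  # Elements 2 and 4 have a symmetrical interdependency
dsm[3, 1] = 1

np.fill_diagonal(dsm, 0)  # the diagonal is usually left empty

# Asymmetry: Element 1 depends on Element 2, but not vice versa.
assert dsm[1, 0] == 1 and dsm[0, 1] == 0
# Symmetry: Elements 2 and 4 depend on each other.
assert dsm[1, 3] == dsm[3, 1] == 1
```

The transposed convention mentioned at the end of the paragraph would simply swap the roles of rows and columns, i.e. use `dsm.T`.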
Protecting Publicly Available Data With Machine Learning Shortcuts
Müller, Nicolas M., Burgert, Maximilian, Debus, Pascal, Williams, Jennifer, Sperl, Philip, Böttinger, Konstantin
Machine-learning (ML) shortcuts or spurious correlations are artifacts in datasets that lead to very good training and test performance but severely limit the model's generalization capability. Such shortcuts are insidious because they go unnoticed due to good in-domain test performance. In this paper, we explore the influence of different shortcuts and show that even simple shortcuts are difficult to detect by explainable AI methods. We then exploit this fact and design an approach to defend online databases against crawlers: providers such as dating platforms, clothing manufacturers, or used car dealers have to deal with a professionalized crawling industry that grabs and resells data points on a large scale. We show that a deterrent can be created by deliberately adding ML shortcuts. Such augmented datasets are then unusable for ML use cases, which deters crawlers and the unauthorized use of data from the internet. Using real-world data from three use cases, we show that the proposed approach renders such collected data unusable, while the shortcut is at the same time difficult to notice in human perception. Thus, our proposed approach can serve as a proactive protection against illegitimate data crawling.
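The defense described above amounts to injecting a spurious correlation that a crawler's model will latch onto. A hedged sketch of the idea, not the paper's actual method: embed a small, label-dependent pixel pattern into each image so that a model trained on the crawled data learns the artifact instead of the task.

```python
import numpy as np

def add_shortcut(images: np.ndarray, labels: np.ndarray,
                 strength: int = 4) -> np.ndarray:
    """Embed a subtle, label-dependent artifact into each image so a model
    trained on the data can 'solve' the task from the artifact alone.
    Illustrative sketch: nudge a 2x2 corner patch by an amount tied to the
    class label -- small enough to be hard to notice in human perception,
    consistent enough to act as a shortcut."""
    out = images.copy()
    for img, y in zip(out, labels):
        patch = img[:2, :2].astype(int) + strength * (int(y) + 1)
        img[:2, :2] = np.clip(patch, 0, 255).astype(img.dtype)
    return out

rng = np.random.default_rng(0)
imgs = rng.integers(0, 256, size=(8, 28, 28), dtype=np.uint8)
ys = rng.integers(0, 2, size=8)
poisoned = add_shortcut(imgs, ys)
```

A model trained on `poisoned` can reach high in-domain accuracy by reading the corner patch, yet fails on clean data, which is exactly what makes the crawled dataset unusable for downstream ML.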
Michael Jayawardana on LinkedIn: #bloomberggpt #artificialintelligence
Bloomberg's announcement that it created a ChatGPT-like large language model focused on finance created a bit of a stir. "BloombergGPT AI may be the harbinger of the next wave of corporate AI," Ethan Mollick, a professor at Wharton, tweeted. He noted that building models is all about the training data and Bloomberg enjoyed the advantage of including proprietary data about finance as well as general information scraped from the Web. Reading the Bloomberg research paper provides some insight into the strange terrain where we find ourselves. Among other things, Bloomberg used a data set called "Enron Emails."
NVIDIA Unveils Large Language Models and Generative AI Service to Advance Life Sciences R&D
GTC--NVIDIA today announced an expanded set of generative AI cloud services for customizing AI foundation models to accelerate the creation of new proteins and therapeutics, as well as research in the fields of genomics, chemistry, biology and molecular dynamics. Part of NVIDIA AI Foundations, the new BioNeMo Cloud service offering -- for both AI model training and inference -- accelerates the most time-consuming and costly stages of drug discovery. It enables researchers to fine-tune generative AI applications on their own proprietary data, and to run AI model inference directly in a web browser or through new cloud application programming interfaces (APIs) that easily integrate into existing applications. "The transformative power of generative AI holds enormous promise for the life science and pharmaceutical industries," said Kimberly Powell, vice president of healthcare at NVIDIA. "NVIDIA's long collaboration with pioneers in the field has led to the development of BioNeMo Cloud Service, which is already serving as an AI drug discovery laboratory. It provides pretrained models and allows customization of models with proprietary data that serve every stage of the drug-discovery pipeline, helping researchers identify the right target, design molecules and proteins, and predict their interactions in the body to develop the best drug candidate."
NVIDIA unveils AI Foundations, its customizable Gen-AI cloud service
The age of enterprise AI has come crashing down upon us in recent months. Public infatuation with ChatGPT since its release last November has opened the floodgates of corporate interest and set off an industry-wide land grab, with every major tech entity vying to stake its claim in this burgeoning market by incorporating generative AI features into existing products. Heavyweights including Google, Microsoft, Meta, and Baidu are already jockeying their Large Language Models (LLMs) for market dominance, while everybody else, from Adobe and AT&T to BMW and BYD, scrambles to find uses for the revolutionary technology. NVIDIA's newest cloud services offering, AI Foundations, will allow businesses lacking the time and money to develop their own models from scratch "to build, refine and operate custom large language models and generative AI models that are trained with their own proprietary data and created for their unique domain-specific tasks." These services include NeMo, NVIDIA's large language model customization service; BioNeMo, a drug and molecule discovery-focused fork of the NeMo model built for the medical research community; and Picasso, an AI capable of generating images, video and "3D applications… to supercharge productivity for creativity, design and digital simulation," according to Tuesday's release.
Who Will Make Money from the Generative AI Gold Rush? Part I
BigTech companies already dominate GenAI infrastructure with their cloud services and hardware chips. Microsoft and Google are well-positioned in the US cloud market, while Baidu and Alibaba are well-positioned in China. Their massive supercomputer cloud infrastructure is engineered to run GenAI's complex, expensive, large text, visual, and audio Foundational Models. Many developers already use their cloud AI API services and tools to build apps, and this trend is expected to accelerate as entrepreneurs rush to address virtually limitless GenAI use cases. Amazon has been quiet on Foundational Models, so a big question is how it will respond. GenAI uses massive amounts of computational power to generate creative outputs.